Evaluation of completeness and timeliness of data in the National Information System for Notifiable Diseases for spotted fever in the state of São Paulo, Brazil, 2007-2017

Abstract Objective: to evaluate the completeness and timeliness of notifications of cases of spotted fever (SF) held on the Notifiable Health Conditions Information System (SINAN) in São Paulo State, Brazil, from 2007 to 2017. Methods: this was a descriptive and ecological study of confirmed human cases of SF regarding completeness and timeliness of ten fields of the notification form (good if ≥ 90% for most variables); time series analysis was performed using the Prais-Winsten technique. Results: we analyzed 736 records; among essential fields, only “Discharge date” showed poor completeness (68.5%). Timeliness was good for the “Investigation” and “Closure” fields; other time lapses were not adequate. Conclusion: in São Paulo state, data completeness was good for most variables, whereas timeliness was not adequate (except for “Closure” and “Investigation”), pointing to the need for health education and communication actions about SF.


INTRODUCTION
Spotted fever (SF) is an acute febrile zoonotic disease, caused by bacterial species of the Rickettsia genus. 1 In São Paulo State, two distinct diseases related to SF have been identified: one caused by the Rickettsia rickettsii species, traditionally known as Brazilian SF, and the other caused by the Rickettsia parkeri bacterium, referred to as SF. 1 R. rickettsii is transmitted by ticks of the Amblyomma sculptum and Amblyomma aureolatum species, while R. parkeri is transmitted by ticks of the Amblyomma ovale species. 2-5 The clinical picture of SF is characterized by different levels of severity and a high case fatality ratio in humans, 6 whereby fatality can reach 80% in advanced forms if not treated. 7 The disease is an important public health problem, which is why it has been included on the Ministry of Health list of compulsorily notifiable diseases since 2001. 8 With effect from 2014, all SF cases, both suspected and confirmed, must be notified immediately on the Notifiable Health Conditions Information System (Sistema de Informação de Agravos de Notificação - SINAN) within 24 hours. 9,10 In Brazil, 2,293 SF cases were recorded between 2007 and 2020. 11 In São Paulo State, 12 936 cases were notified in the last five years, 549 (58.7%) of which progressed to death. The majority of cases notified in São Paulo state in that period were male (84.8%), the highest proportion corresponded to the 20-59 age group (520 cases; 55.6%), and the case fatality ratio was estimated to be 54.4%. 12 Production of high quality epidemiological information is both strongly recommended and highly desirable, in order for data analyses to be able to represent the real magnitude and health status of an event in a given territory. 13
Completeness and timeliness are indicators of data quality used in performance reviews, and are recommended by national 14 and international 15 health authorities for identifying populations and areas at epidemiological risk, as well as assisting health action programming.
Completeness is understood to mean the proportion of fields (mandatory and/or essential) filled in on data collection instruments. 13 Poor filling in of notification form fields leads to production of deficient and less reliable data, contributing to a poorer understanding of the dynamics of a disease, due to incorrect indicators of incidence, mortality and fatality, for example. 13,16,17 Timeliness, in turn, is the time lapse between different stages of the surveillance process and refers to the time taken by an Epidemiological Surveillance Service to obtain information in a timely and efficient manner, offering input for more accurate decision making by health authorities. 18 Analyzing "timeliness" can contribute to the improvement of epidemiological surveillance and health system information management, and to the identification of possible interfering factors related to health service users, health professionals and laboratories, such as access

Main results
In São Paulo state the majority of the spotted fever variables had good completeness; closure and investigation had good timeliness; and laboratory investigation timeliness was inadequate.

Implications for services
The study can contribute to better resource allocation in areas such as surveillance, health worker training, as well as adoption of regionalized health policies.

Perspectives
The situation analyses provide more information for public health authorities, including for the re-evaluation of activities commonly considered bureaucratic, with subsequent reflection in health indicators.

ORIGINAL ARTICLE
to health services, human resource training and sample processing time. 19 The objective of this study was to evaluate the completeness and timeliness of data notified on the SINAN system for SF cases in São Paulo State between 2007 and 2017, with the aim of contributing to improving the SF epidemiological surveillance process, and to analyze the spatial distribution of SINAN data quality regarding timeliness, in order to identify discrepancies across São Paulo State. The SINAN SF notification form has 63 fields to be filled in, classified according to the breakdown provided by the information system data dictionary: (i) mandatory fields, whereby missing information implies non-inclusion of the notification or investigation on the SINAN, and (ii) essential fields, the filling in of which is not mandatory, whereby missing information influences the calculation of epidemiological or operational indicators.

METHODS
The completeness of the database was evaluated for the following essential fields: "Date of hospital admission", "Date of discharge", "Date of first serological sample collection" and "Date of second serological sample collection". Among the essential fields, "Date of hospital admission" was selected as it indicates more severe cases that require hospitalization; the other fields served as indirect parameters for evaluation of the care provided to the individual, considering health surveillance guidelines. Moreover, when filling in these fields the "Unknown" option cannot be used. Time lapses greater than 365 days and negative time lapses were excluded from our analyses of timeliness.
Percentage completeness of each variable was calculated by dividing total filled in cases that were not null (excluding 'unknown' cases) by total confirmed cases, for each year of the study period. The percentage of timely notifications for each variable was obtained by dividing the number of notifications that met the time limit criterion by the total number of confirmed cases with valid notifications.
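The two percentages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code (the analyses were run in R): the record layout, the field names (`symptom_onset`, `notification`, `discharge`) and the four example cases are hypothetical. A field left blank or marked "unknown" is represented as `None`, and the seven-day limit is the one reported for notification after symptom onset.

```python
from datetime import date

def completeness_pct(records, field):
    """Percentage of confirmed cases whose `field` is filled in (not None)."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) is not None)
    return 100.0 * filled / len(records)

def lapse_days(record, start_field, end_field):
    """Time lapse in days, or None when it cannot be used.

    Lapses that are negative or longer than 365 days are treated as
    invalid, mirroring the exclusion rule stated in the Methods.
    """
    start, end = record.get(start_field), record.get(end_field)
    if start is None or end is None:
        return None
    days = (end - start).days
    if days < 0 or days > 365:
        return None
    return days

def timeliness_pct(records, start_field, end_field, limit_days):
    """Percentage of valid lapses that are within `limit_days`."""
    lapses = [lapse_days(r, start_field, end_field) for r in records]
    valid = [d for d in lapses if d is not None]
    if not valid:
        return 0.0
    timely = sum(1 for d in valid if d <= limit_days)
    return 100.0 * timely / len(valid)

# Hypothetical confirmed cases (illustrative values only).
cases = [
    {"symptom_onset": date(2015, 1, 1), "notification": date(2015, 1, 5),
     "discharge": date(2015, 1, 20)},
    {"symptom_onset": date(2015, 2, 1), "notification": date(2015, 2, 12),
     "discharge": None},                       # discharge date not filled in
    {"symptom_onset": date(2015, 3, 10), "notification": date(2015, 3, 8),
     "discharge": date(2015, 3, 30)},          # negative lapse, excluded
    {"symptom_onset": date(2015, 4, 1), "notification": date(2015, 4, 7),
     "discharge": date(2015, 4, 15)},
]

print(completeness_pct(cases, "discharge"))
print(timeliness_pct(cases, "symptom_onset", "notification", 7))
```

In the study these percentages were computed for each year of the period; in this sketch that would mean grouping the records by notification year before calling the two functions.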
According to the parameters recommended by the United States Centers for Disease Control and Prevention (CDC) 15 and by the Brazilian Ministry of Health, 14 completeness and timeliness were classified as good (≥ 90.0%), regular (≥ 70.0% to < 90.0%) or poor (< 70.0%) for all the variables, except for case "Closure" timeliness, which was classified as good (≥ 80.0%), regular (≥ 70.0% to < 80.0%) or poor (< 70.0%). 14 Due to the lack of established parameters in the literature for the completeness and timeliness of "Serological analyses", we opted to use the "Closure" category cut-offs adopted by the Ministry of Health. 14 We prepared box-plots for each of the time lapses analyzed. The time lapses (in days) for the timeliness attribute were characterized by means of descriptive statistics (mean, standard deviation, median, minimum and maximum values).
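The classification cut-offs stated above translate directly into a small helper. The sketch below is illustrative: the function name `classify` and the `closure_scale` flag are hypothetical, with the flag selecting the 80.0% cut-off used for "Closure" (and, as stated, for "Serological analyses").

```python
def classify(pct, closure_scale=False):
    """Classify a completeness or timeliness percentage.

    Default scale: good >= 90.0, regular 70.0 to < 90.0, poor < 70.0.
    With closure_scale=True the "good" cut-off drops to 80.0.
    """
    good_cutoff = 80.0 if closure_scale else 90.0
    if pct >= good_cutoff:
        return "good"
    if pct >= 70.0:
        return "regular"
    return "poor"

print(classify(97.9))                      # default scale
print(classify(81.0, closure_scale=True))  # "Closure" scale
```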
The time trend analyses for completeness and timeliness were performed using the Prais-Winsten technique, a generalized least squares linear regression procedure that corrects for first-order autocorrelation in the residuals. 21 First, the percentage curves found over time were visually analyzed. Prais-Winsten regression was then fitted for each variable. The time trend was considered to be rising if the Annual Percentage Change (APC) was positive with a p-value < 0.05, falling if the APC was negative with a p-value < 0.05, or stable when the p-value was > 0.05, regardless of the sign of the APC.
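For readers unfamiliar with the procedure, the following is a minimal pure-Python sketch of an iterated Prais-Winsten fit for a single yearly series. It rests on assumptions that the text does not state: the outcome is taken as the base-10 logarithm of the yearly percentage, and the APC is derived from the slope b1 as (10^b1 - 1) × 100, which is a common convention for this kind of trend analysis. The function names and the example percentages are hypothetical, and the significance test (p-value) used to label a trend as rising, falling or stable is not reproduced here.

```python
import math

def _fit_line(X, y):
    """Least-squares coefficients for a two-column design matrix X."""
    s11 = sum(a * a for a, _ in X)
    s12 = sum(a * b for a, b in X)
    s22 = sum(b * b for _, b in X)
    t1 = sum(a * v for (a, _), v in zip(X, y))
    t2 = sum(b * v for (_, b), v in zip(X, y))
    det = s11 * s22 - s12 * s12
    return (s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det

def prais_winsten_apc(years, percentages, max_iter=50, tol=1e-8):
    """Return (slope, rho, APC) from an iterated Prais-Winsten fit.

    Assumes log10(percentage) = b0 + b1 * year + AR(1) error.
    """
    n = len(years)
    y = [math.log10(p) for p in percentages]
    X = [(1.0, float(t - years[0])) for t in years]  # years shifted to start at 0
    b0, b1 = _fit_line(X, y)                         # ordinary least squares start
    rho = 0.0
    for _ in range(max_iter):
        resid = [v - b0 - b1 * t for v, (_, t) in zip(y, X)]
        num = sum(resid[i] * resid[i - 1] for i in range(1, n))
        den = sum(e * e for e in resid[:-1])
        new_rho = num / den if den > 0 else 0.0
        new_rho = max(-0.99, min(0.99, new_rho))     # keep the AR(1) term stationary
        # Prais-Winsten transform: the first observation is kept, rescaled by
        # sqrt(1 - rho^2), instead of being dropped.
        w = math.sqrt(1.0 - new_rho ** 2)
        Xs = [(w * X[0][0], w * X[0][1])]
        ys = [w * y[0]]
        for i in range(1, n):
            Xs.append((X[i][0] - new_rho * X[i - 1][0],
                       X[i][1] - new_rho * X[i - 1][1]))
            ys.append(y[i] - new_rho * y[i - 1])
        b0, b1 = _fit_line(Xs, ys)
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    apc = (10 ** b1 - 1) * 100
    return b1, rho, apc

# Illustrative yearly percentages (not study data).
years = list(range(2007, 2018))
pct = [60.0, 62.5, 61.0, 66.0, 70.5, 69.0, 74.0, 78.5, 80.0, 83.0, 86.5]
slope, rho, apc = prais_winsten_apc(years, pct)
print(f"rho = {rho:.3f}, APC = {apc:.2f}% per year")
```

Keeping the rescaled first observation is what distinguishes Prais-Winsten from the Cochrane-Orcutt procedure, and it matters for short series such as the eleven yearly points analyzed here.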
We performed all statistical analysis using R software, versions 2.18.24 and 4.2.2.

São Paulo State is divided into 28 Epidemiological Surveillance Groups (Grupos de Vigilância Epidemiológica - GVE). 22 We undertook a study of spatial distribution per municipality of notification and per GVE with the purpose of evaluating completeness and timeliness in a regionalized manner. We opted to detail the timeliness parameters for the "Notification versus date of symptom onset", "Investigation", "Data input" and "Closure" variables, in relation to actions carried out by municipal health services. We prepared choropleth maps to represent the timeliness percentages, using the Quantum GIS application, version 3.

RESULTS
Between 2007 and 2017, 739 cases of SF were confirmed in São Paulo state. Three cases (0.4%), who did not reside in São Paulo state, were excluded from the present analysis. As such, we analyzed 736 cases, 77.2% (568) of which were autochthonous.
Essential field completeness was found to be good for "Date of first serological sample collection" (97.9%), "Date of hospital admission" (96.6%) and "Date of second serological sample collection" (90.3%). Completeness of the "Date of discharge" field was regular (84.4%). At least 59.5% of cases were reported in a timely manner, i.e. within seven days from symptom onset. Only 33% of cases had their first serological sample collected within 24 hours from the time of notification. More than 81% of confirmed cases (588 cases) were closed in a timely manner.
The time lapses (in days) between the onset of first symptoms, serological sample collections (1st and 2nd samples), input date, investigation, case closure, and the date of notification are shown in the box-plots in Figure 1. The mean, median, standard deviation, maximum and minimum values of the time lapses (in days) found by our timeliness analyses, by reporting year, are shown in Table 1. The greatest data dispersion was found for the "Data input" variable, which showed a six-day median and a mean ranging from 12.3 to 67.2 days. There was little data dispersion (median equal to zero: 0.0) between notification and first serological sample collection; and between notification and investigation, the means of which ranged from 1.2 to 9.3 days and from 0.0 to 9.8 days, respectively. Median time between notification and case closure was 34 days, also with little data dispersion. Little variability was also found in the notification data regarding the onset of symptoms and the collection of both serological samples.
The time trend analyses are shown in Figure 2 and Table 2. Only the "Discharge date" and "Date of second serological sample" completeness curves and the "Closure" timeliness curve had a rising time trend. The trend was stable for all the other variables.
Although its trend remained stable over the period, "Investigation" had good timeliness in 2009, 2010, 2014 and 2016, while "Closure" had good timeliness with effect from 2011. The same was not found for serological sampling and other timeliness variables.
Ninety-seven municipalities distributed over 23 GVEs notified confirmed cases of SF; the exceptions were the Araçatuba, Franca, Franco da Rocha, Itapeva and Jales GVEs, where no cases were notified in the period studied. In our analysis by GVE, we examined timeliness of the "Notification", "Data input", "Investigation" and "Closure" fields, as shown in Figure 3. With regard to "Notification" (Figure 3A), only four GVEs had good timeliness: Barretos, Santos, São José dos Campos and São José do Rio Preto. In the case of "Data input" (Figure 3B), six GVEs had good timeliness: Osasco, Araraquara, Barretos, Bauru, Presidente Venceslau and Ribeirão Preto. In relation to "Investigation" (Figure 3C), all the GVEs had good timeliness, except for the Osasco GVE, for which "Investigation" was classified as regular. Ten GVEs achieved good classification for "Closure" (Figure 3D).

Table 1 - Statistical description of the time lapses (in days) of the timeliness analyses of confirmed spotted fever cases by year of notification, São Paulo State, 2007-2017
Legend: sd = standard deviation; Max = maximum value; Min = minimum value.

DISCUSSION
The main finding of this study is that in the period from 2007 to 2017, the completeness of notifications of confirmed cases of SF was considered good 14 for most of the variables studied; however, the timeliness of laboratory "Investigation" could not be considered adequate. 14 In 2017, the municipalities where confirmed cases of SF were reported concentrated almost 60% of the state's population, this being a relevant finding.
The timeliness of SF case "Investigation" was considered good in São Paulo State in the period studied. This could result from adequate health surveillance actions, with identification of probable infection sites for this disease, detailed in São Paulo Epidemiological Bulletins. 23,24 The good completeness found for most variables, in more than half of the years studied, corroborates this result.
Another satisfactory finding of the study was the rising trend in completeness observed in two of the post-notification follow-up fields, "Date of discharge" from hospital and "Date of collection of second serological sample", as well as a rising trend in the timeliness of "Closure". The absence of a falling trend for timeliness in the other analyses is also relevant.
In the period 2007-2017, the increase in the completeness of the essential fields suggests improvement in the quality of information and knowledge about the disease. This may be the result of the Epidemiological Surveillance Service actively tracing notified cases to complete the information, reinforced by the obligation to immediately notify SF with effect from 2014. 9 The problem is relevant, because in Brazil, shortcomings in data filling have been reported in other studies on different notifiable diseases. 13,25,26 Several factors can interfere with the filling in of notification forms, such as (i) not being aware of the importance of the information collected, (ii) the perception that notification is a merely bureaucratic task, (iii) loss of motivation and work overload among the professionals involved, (iv) definition of other priorities by decision-making bodies, as well as variations according to the characteristics of the local health system. 16,17,27
Timely case investigation means that related actions were also undertaken in a timely manner, possibly including environmental research to identify the vector tick species and probable infection site. 7 The fact that almost all cases were investigated in a timely manner indicates that the Epidemiological Surveillance Service operated efficiently in most of the state during the entire period, contributing to the accuracy of knowledge of the epidemiological profile of the disease in the state and the distribution of the vector tick.
Good timeliness of suspected diagnosis is fundamental for guiding early correct antibiotic treatment, which can reduce the case fatality ratio. 6 Correct classification of the risk of disease occurring in the territory is also important, in order to support medical decisions.
The study showed that the timing of sample collection for serological analysis was a parameter that needs to be improved. This is essential information for classifying cases and, consequently, probable infection sites and areas of transmission. 24 Missing or inadequate laboratory information on the SINAN, or even its untimely input to the system, can lead to inaccurate results and jeopardize epidemiological investigation of cases. 7,14,23,24 Integration between SINAN databases and reference laboratory databases is recommended in order to get around this limitation, as this would result in better data quality.
The prolonged time lapse found between onset of symptoms and notification may be associated with the person's delay in seeking medical attention or may signal a delay in suspected diagnosis, not necessarily meaning a failure in the notification system. 19,26 This delay can interfere in case outcomes, as well as in actions to control the disease and prevent infection risks in humans. 14,24
More than 80.0% of the cases were closed in a timely manner, providing a good evaluation of the epidemiological profile and, consequently, implementation of effective control and prevention measures to be adopted by the surveillance services. 28 It should be added that in this period, there was very close contact with the state surveillance system, according to which, over time, specific integrated actions were implemented between the central level of epidemiological surveillance and local health services to achieve the best approach to the disease. This may have been reflected in the good "Case closure" results found with effect from 2011.
Regarding spatial distribution, the identification of five GVEs with no notifications is noteworthy. This situation needs specific investigation, because the presence of vector ticks has already been described in the entire state of São Paulo 3-5,10,29,30 as have hosts that spread the etiologic agent (dogs and capybaras). 3,4,6,10,23,24 The fact that "Notification" timeliness was found to be inadequate in more than half of the territories suggests the need for communication campaigns about SF directed to the population at risk, in addition to frequent training of health professionals.
"Investigation" timeliness was good in almost all territories in which notifications were made, suggesting that the related actions were carried out in a timely manner, starting with case recording. However, the timeliness of "Closure" and "Data input", commonly considered to be bureaucratic activities, was unsatisfactory in most of São Paulo State, which may imply the need to reassess the management of activities inherent to the process.
As limitations of this study, using a database that only contained confirmed cases did not allow us to analyze the quality of the database as a whole, including suspected and discarded cases. Moreover, it was not possible to resolve all the inconsistencies of all the cases made available -despite repeated efforts -although this only resulted in the exclusion of a tiny part (< 5%) of the initial database, with no impact on the results found.
This study allowed us to gain knowledge of the profile of SF data quality, providing information for public health authorities and their situation analysis of the disease over time, this being reflected in the related health indicators. As the first regionalized evaluation of these parameters in São Paulo state, this study contributes to the improvement of data collection about SF in the cities comprising the state's different GVEs. This allows better allocation of resources in surveillance areas, health professional training and adoption of health policies by the state's regions.
To date, this is the only study known to have evaluated completeness and different parameters of timeliness for data on SF held on the SINAN, including their spatial distribution. Finally, it should be emphasized that the analysis of the time lapses between serological sample collections from confirmed cases was an original initiative of this research.

AUTHOR CONTRIBUTIONS
Xavier DR and Pinter A contributed to the study concept and design, data analysis and interpretation, drafting and approving the final version of the article. Albuquerque MP contributed to data analysis and interpretation. Sousa-Carmo SVT contributed to data analysis and interpretation and with relevant critical reviewing of the intellectual content of the manuscript and approving the final version of the article. All the authors have approved the final version of the article and are responsible for all aspects thereof, including the guarantee of its accuracy and integrity.